Identification of multiple rare variants associated with a disease
Abstract
Advances in sequencing technologies have made it easier to identify rare variants that are responsible for complex disease. However, statistical methods that can handle the vast amount of data generated and interpret the complicated relationship between disease and these variants have lagged behind. We applied a zero-inflated Poisson regression model to account for the excess of zeros caused by the extremely low frequency of the 24,487 exonic variants in the Genetic Analysis Workshop 17 data. We grouped the 697 subjects in the data set as Europeans, Asians, and Africans based on principal components analysis and computed the total number of rare variants per gene for each individual. We then analyzed these collapsed variants under the assumption that rare variants are enriched in people affected by a disease compared with unaffected people. We also tested this hypothesis with the quantitative traits Q1, Q2, and Q4. Analyses performed on the combined 697 individuals and on each ethnic group yielded different results. For the combined population analysis, we found that UGT1A1, which was not part of the simulation model, was associated with disease liability and that FLT1, which was a causal locus in the simulation model, was associated with Q1. Of the causal loci in the simulation models, FLT1 and KDR were associated with Q1 and VNN1 was correlated with Q2. No genes were significantly associated with Q4. These results show the feasibility and capability of our new statistical model to detect multiple rare variants influencing disease risk.
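The two ingredients named in the abstract, collapsing rare variants into per-gene counts and modeling those counts with a zero-inflated Poisson distribution, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the input layout (nested dictionaries), and the choice to sum minor-allele counts per gene are all assumptions made for the example, and no covariates or regression fitting are included.

```python
import math
from collections import defaultdict


def collapse_by_gene(genotypes, variant_gene):
    """Collapse rare variants into one count per gene for each subject.

    genotypes:    {subject: {variant_id: minor-allele count (0, 1, or 2)}}
    variant_gene: {variant_id: gene symbol}
    Assumption: the per-gene total is the sum of minor-allele counts.
    """
    collapsed = {}
    for subject, calls in genotypes.items():
        per_gene = defaultdict(int)
        for variant, n_alleles in calls.items():
            per_gene[variant_gene[variant]] += n_alleles
        collapsed[subject] = dict(per_gene)
    return collapsed


def zip_pmf(k, lam, pi):
    """P(Y = k) under a zero-inflated Poisson distribution.

    With probability pi the count is a structural zero; otherwise it is
    drawn from Poisson(lam). Setting pi = 0 gives the ordinary Poisson.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_log_likelihood(counts, lam, pi):
    """Log-likelihood of observed counts for fixed lam and pi."""
    return sum(math.log(zip_pmf(y, lam, pi)) for y in counts)


# Hypothetical toy data: two subjects, three variants in two genes.
toy_genotypes = {
    "s1": {"v1": 1, "v2": 0, "v3": 2},
    "s2": {"v1": 0, "v2": 0, "v3": 0},
}
toy_variant_gene = {"v1": "FLT1", "v2": "FLT1", "v3": "KDR"}

per_gene_counts = collapse_by_gene(toy_genotypes, toy_variant_gene)
flt1_counts = [per_gene_counts[s]["FLT1"] for s in sorted(per_gene_counts)]
print(per_gene_counts)
print(zip_log_likelihood(flt1_counts, lam=0.3, pi=0.6))
```

In a regression setting, `lam` and `pi` would each depend on covariates such as disease status through link functions; the fixed-parameter likelihood above only shows how the extra zero mass enters the model.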
Similar resources
A Novel Support Vector Machine-Based Approach for Rare Variant Detection
Advances in next-generation sequencing technologies have enabled the identification of multiple rare single nucleotide polymorphisms involved in diseases or traits. Several strategies for identifying rare variants that contribute to disease susceptibility have recently been proposed. An important feature of many of these statistical methods is the pooling or collapsing of multiple rare single n...
Identification of a Novel Splice Site Mutation in RUNX2 Gene in a Family with Rare Autosomal Dominant Cleidocranial Dysplasia
Introduction: Pathogenic variants of RUNX2, a gene that encodes an osteoblast-specific transcription factor, have been shown as the cause of CCD, which is a rare hereditary skeletal and dental disorder with dominant mode of inheritance and a broad range of clinical variability. Due to the relative lack of clinical complications resulting in CCD, the medical diagnosis of this disorder is challen...
Identification of Grouped Rare and Common Variants via Penalized Logistic Regression
In spite of the success of genome-wide association studies in finding many common variants associated with disease, these variants seem to explain only a small proportion of the estimated heritability. Data collection has turned toward exome and whole genome sequencing, but it is well known that single marker methods frequently used for common variants have low power to detect rare variants ass...
CRB1-Related Leber Congenital Amaurosis: Reporting Novel Pathogenic Variants and a Brief Review on Mutations Spectrum
Background: Leber congenital amaurosis (LCA) is a rare inherited retinal disease causing severe visual impairment in infancy. It has been reported that 9-15% of LCA cases have mutations in CRB1 gene. The complex of CRB1 protein with other associated proteins affects the determination of cell polarity, orientation, and morphogenesis of photoreceptors. Here, we report three novel pathogenic varia...
Comparison of collapsing methods for the statistical analysis of rare variants
Novel technologies allow sequencing of whole genomes and are considered as an emerging approach for the identification of rare disease-associated variants. Recent studies have shown that multiple rare variants can explain a particular proportion of the genetic basis for disease. Following this assumption, we compare five collapsing approaches to test for groupwise association with disease statu...